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We develop the statistical methods for comparing two sets of arrival directions 
of cosmic rays in which the two-dimensional distribution of arrival directions is re- 
duced to the one-dimensional distributions so that the standard one-dimensional 
Kolmogorov-Smirnov test can be applied. Then we apply them to the analysis 
of correlation between the ultra-high energy cosmic rays (UHECR) with energies 
above 5.7 x 10 19 eV, observed by Pierre Auger Observatory (PAO) and Akeno Giant 
Air Shower Array (AGASA), and the active galactic nuclei (AGN) within the dis- 
tance 100 Mpc. For statistical test, we set up the simple AGN model for UHECR 
sources in which a certain fraction of observed UHECR are originated from AGN 
within a chosen distance, assuming that all AGN have equal UHECR luminosity 
and smearing angle, and the remaining fraction are from the isotropic background 
contribution. For the PAO data, our methods exclude not only a hypothesis that 
the observed UHECR are simply isotropically distributed but also a hypothesis that 
they are completely originated from the selected AGN. But, the addition of appro- 
priate amount of isotropic component either through the background contribution 
or through the large smearing effect improves the correlation greatly and makes the 
AGN hypothesis for UHECR sources a viable one. We also point out that restricting 
AGN within the distance bin of 40 — 60 Mpc happens to yield a good correlation 
without appreciable isotropic component and large smearing effect. For the AGASA 
data, we don't find any significant correlation with AGN. 
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I. INTRODUCTION 



The sources of ultra-high energy cosmic rays (UHECR) still remain in veil, but the 
recent observation of the suppression in the energy spectrum of UHECR above the so-called 
Greisen-Zatsepin-Kuzmin (GZK) cutoff (E GZK ~ 4 x 10 19 GeV) [H, 0] indicates that at 
least the UHECR with energies above the GZK cutoff come from relatively close (within 
the GZK radius ~ 100 Mpc) extragalactic sources. The magnitude and extension of the 
intergalactic magnetic fields are supposed not to be large enough to significantly alter the 
arrival directions of UHECR with these energies. Thus, the distribution of arrival directions 
is correlated with the distribution of the sources and the arrival direction analysis for the 
UHECR with energies above the GZK cutoff can be very useful in identifying their sources. 
An important step toward this direction was initiated by the correlation between arrival 
directions of UHECR and nearby active galactic nuclei (AGN) reported by the Pierre Auger 
Observatory (PAO) [3(1 . Though further analysis with more data weakened the significance 
of the correlation |4|, |5[, it still remains as an important issue. Another important clue for 
the UHECR sources may come from the study of the correlation between the UHECR arrival 
directions and the large scale structures manifested in the galaxy distribution or the matter 
distribution. It was studied by several groups and the results are not quite conclusive 

yet. The positive result will provide the basis for the further study of correlations between 
the UHECR and specific classes of astrophysical objects. 

For the study of correlation between the arrival directions of UHECR and some astro- 
physical objects, we have to rely on the statistical methods. This is because the current 
limit of angular resolution in CR experiments and a poor understanding of the intergalactic 
magnetic fields make it difficult to pin down individual sources of UHECR from their arrival 
directions. This also seems to be unavoidable as we consider that currently the number 
of astrophysical objects which are candidates for UHECR sources is very large while the 
number of observed UHECR events is small yet. 

Statistical studies of correlation between the arrival directions of UHECR and the as- 



trophysical objects was done in many ways [3_H5|, lllHl9l|. In this paper, first we improve 
the previously used methods and combine them to obtain more reliable estimates of the 
significance of correlation. Our improvement is twofold. Firstly, while the previous analyses 
compared the representative values of the two-dimensional distributions of UHECR arrival 
directions, for example the number of UHECR events within the certain angular distance 
from the supposed sources, ours compares the one- dimensional distributions obtained from 
the two-dimensional distributions, which conveys more information in the sense that, com- 
pared with the above example of representative value, this one-dimensional distribution 
contains information about the number of UHECR events within any angular distance, not 
just within a certain specific angular distance. Secondly, the previous analyses focused on 
the deviation from isotropy rather than the direct correlation with the supposed sources 
by taking as their null hypothesis that the UHECR arrival directions are isotropically dis- 
tributed. Here, we take as a null hypothesis the explicit source model for UHECR including 
the isotropic component and focus on the direct correlation of UHECR with the supposed 
sources. The basic idea is that we reduce the two-dimensional distribution of arrival direc- 
tions to the one-dimensional distributions, which can be compared by using the well-known 
Kolmogorov-Smirnov (KS) test. Then, we apply these methods to examine the correlation 
between UHECR arrival directions and AGN. For statistical test, we set up the simple AGN 
model for the UHECR sources, in which all or a fraction of UHECR above a certain en- 
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ergy cutoff are originated from the AGN within a certain distance cut. For simplicity, we 
assume that all selected AGN have the equal luminosity and smearing angle of UHECR. 
The remaining fraction is filled with the isotropic component of UHECR to account for the 
contribution from the outside of the distance cut. We use the AGN data listed in the 12th 
edition of Veron-Cetty and Veron catalog [13] and the UHECR data obtained by PAO [H 
and Akeno Giant Air Shower Array (AGASA) [2l| to cover both the southern and northern 
hemispheres and compare the results drawn from them. 

This paper is organized as follows. In section [Til we introduce three statistical methods 
for comparing two distributions of arrival directions. In section IIIH we explain the simple 
AGN model for the UHECR sources and the details needed for the generation of Monte- 
Carlo events for the model and the statistical comparison with the observed data. Section 
IIVI presents the results of our analysis. We conclude in section [V] 



II. STATISTICAL COMPARISON OF TWO ARRIVAL DIRECTION 

DISTRIBUTIONS 



We want to measure the correlation between the observed UHECR arrival directions 
and the directions of the candidate point sources to examine how plausible the candidate 
sources are. What we actually obtain through statistical analysis is the probability that 
the observed UHECR arrival direction distribution come from the expected UHECR arrival 
direction distribution from a given hypothesis about the sources. This comes from statistical 
comparison of two arrival direction distributions. Quantifying how similar two distributions 



of ob 



ects on the sphere are is a non-trivial problem. Various methods have been discussed 



10l . 1 171 Il8l l22H30j . Because the distribution on the sphere is two dimensional, we 



31, 32 . The drawback of naive 



may apply the two-dimensional version of KS test [22|, 
two-dimensional KS test is that it does not properly care the rotational symmetry that 
the distribution on the sphere has 
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In this paper, we adopt the strategy that we 
reduce the two-dimensional distribution on the sphere to one-dimensional one for which the 
rotational symmetry is kept manifestly and apply the one-dimensional KS test to quantify 
the probability. For the reduction of the distribution on the sphere to one-dimensional 
distribution, we consider three ways described below. 



A. Correlational Angular Distance Distribution 

This is most useful when we consider the set of point sources for UHECR. By the corre- 
lation between two different sets of objects, we mean that nearby each object of one set, we 
can find objects of the other set at a certain level. To quantify the distance on the sphere, 
we use the angular distance 9 between two objects on the sphere which is the angle between 
two unit vectors f and f ' pointing two objects, that is, cos 9 = r • r'. For each object of 
the reference set, say the point source set, we count the number of objects of the other set, 
say the UHECR arrival direction set, within a certain angular distance 9. Then we sum up 
counts over all objects of the reference set and estimate the significance level of this number. 
This way of counting has been used previously in many literature 0, 0, Gil, 23 . 



To quantify the significance level of the correlation, we need to know the probability dis- 
tribution of the expected number of objects. It can be calculated analytically or obtained by 
Monte-Carlo simulation, once a hypothesis about the UHECR sources is set up. The simple 
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test bed is a hypothesis that the UHECR arrival directions are isotropically distributed. 
Then the correlation between the set of sources and the set of UHECR arrival directions 
was claimed based on that we observe more count of UHECR than that expected in the 
simple isotropic distribution. But what this procedure actually proves is that the observed 
UHECR arrival directions are not isotropically distributed. To make a stronger claim than 
simply something more than isotropy we should use an elaborated hypothesis for the test. 

The above idea can be slightly generalized to estimate the significance level of the corre- 
lation between two sets of objects on the sphere. Instead of fixing the angular distance 6 to 
a certain value, we consider the distribution of the count over the whole range of the angular 
distance, that is, from 0° to 180°. We call it the correlational angular distance distribution 
(CADD): 

CADD : {cos%, = r» ■ \ i = 1, . . . , N; j = 1, . . . , M} , (1) 

where are the UHECR arrival directions, r'- are the reference directions, and N and M 
are total numbers of them, respectively. In FIG. HJ we illustrate the basic idea and the usage 
of the CADD. We compare the CADD obtained from the reference set (presumably the set 
of point sources) and the observed UHECR arrival directions with that obtained from the 
same reference set and the simulated UHECR arrival directions from a given hypothesis. 
Because two distributions to be compared are one-dimensional, the KS test can be applied 
to yield the probability that the observed distribution of arrival directions comes from the 
hypothesis. The advantage of this method is that it is very simple in practice and does not 
require arbitrary angular binning. Listing in ascending order of all angular distances between 
pairs of objects in the reference set and the UHECR arrival direction set directly gives the 
cumulative probability distribution of the count number which is used in the KS test. Since 
we directly deal with the cumulative probability distribution, binning in angular distance is 
unnecessary and the expected cumulative probability distribution from the hypothesis can 
be made accurate simply by increasing the number of simulated events enough. 



B. Flux Exposure Value Distribution 

Koers and Tinyakov proposed another method to test the correlation between the sources 
and the observed UHECR j8j. In this method, the two-dimensional distribution of UHECR 
arrival directions is reduced to the one-dimensional distribution of expected flux values at 
UHECR arrival directions. At a given arrival direction, the expected flux value is the product 
of the UHECR flux expected from a given hypothesis for the sources and the exposure 
function at that direction. Thereby, we call it the flux exposure value distribution (FEVD): 

FEVD : {Fi = /(r^fr) | z = 1, . . . , N} , (2) 

where fj are the UHECR arrival directions, N is the total numbers of UHECR, /(f j) and 
h{i i) are the UHECR flux and the exposure function, respectively. It is also very simple in 
practice and does not require arbitrary angular binning. One additional merit of this method 
is that it can be used for the continuous source distribution. For the detailed motivations and 
merits of this method, see Ref. jsf. Here, we take this method for checking and comparison. 



(a) (b) 

FIG. 1. Illustrations showing the basic idea of comparing CADD. (a) CADD is the distribution of 
all angular distances between the reference directions (red dots) and the UHECR arrival directions 
(black dots), (b) When the observed UHECR events are more clustered around the reference 
directions than, say, those of the isotropic distribution, the observed CADD has more counts at 
small angles than that expected from the isotropic distribution. 



C. Auto-Angular Distance Distribution 

The CADD and the FEVD with the one-dimensional KS test respects the rotational 
symmetry. But they reduce a two-dimensional distribution on the sphere to one-dimensional 
distribution of angular distances and flux values. It means that the loss of information occurs. 
The one-dimensional KS test on the CADD catches the correlation of the source directions 
and the CR directions, but it can be faked by, say, the aggregation of UHECR themselves 
around a few point sources. To compensate for this, we may consider the distribution of the 
angular distances of all pairs of UHECR arrival directions, which we call the auto-angular 
distance distribution (AADD): 



A ADD : {cos % = Ti ■ I i, j = 1, . . . , N} , (3) 



where r, are the UHECR arrival directions and N is the total numbers of UHECR. It catches 
the aggregation or spreading of UHECR arrival directions. It was used previously to examine 
the clustering of CR arrival directions 36-3^]. The AADD method may not be so relevant 



for the correlation analysis because it does not measure the correlation directly. However, 
it is still useful to compare two distributions of arrival directions, and to test the hypothesis 
for the sources. It is quite useful when the number of sources is small or a few strong sources 
dominate over the others. 



6 



III. MODELING THE ARRIVAL DIRECTIONS OF UHECR 

For a statistical test, we need to know the expected distribution of UHECR arrival di- 
rections. To get the expected distribution from a given model for the UHECR sources, 
Monte-Carlo simulation is commonly used. In this section, we present some details needed 
for the generation of UHECR arrival directions by using Monte-Carlo simulation and sta- 
tistical comparison with the observed data. 



A. Arrival directions of UHECR 

For statistical analysis of arrival directions, we use the UHECR data reported by PAO 
and by AGASA [Hj. These data sets were selected with certain energy cuts. The energy 
cuts of the reported data sets are 5.7 x 10 19 eV for the PAO data set and 4.0 x 10 19 eV for 
the AGASA data set, respectively. The merits of taking a higher value of energy cut for 
UHECR data in the analysis of positional correlations with astrophysical objects are 1) we 
can minimize the deflection due to the galactic and extragalactic magnetic fields; 2) we can 
reduce the diffuse isotropic components. The diffuse isotropic components may come from 
the contributions of our galaxy and far distant astrophysical object. We can reduce these 
contribution by taking the energy cut above the GZK cutoff, restricting possible sources 
within the GZK radius which is around 100 Mpc. Of course, the drawback is that we have 
a smaller number of data, which reduces the statistical power. So we need to compromise. 
Here, we apply the energy cut E > E c = 5.7 x 10 19 eV for both data sets. With this energy 
cut, we have 27 UHECR events from PAO and 23 UHECR events from AGASA. These are 
shown in FIG. d 

The detector array does not cover the sky uniformly and we must consider its efficiency 
as a function of the arrival direction. It depends on the location and the characteristics of 
the detector array. Here we consider only the geometric efficiency which is determined by 
the location and the zenith angle cut of the detector array. Then the exposure function h 
depends only on the declination 5, 

h(S) = — [sin a m cos A cos 5 + a m sin A sin 5} , (4) 

7T 

where A is the latitude of the detector array, 9 m is the zenith angle cut, and 



0, for f > 1, 

oim = { for £ < — 1, with £ 



cos 9 m — sin A sin 5 



-i i- 4.1. • cos A cos 5 

cos 1 4 , otherwise 

The PAO site has A = —35.20° and the zenith angle cut of the data is 9 m = 60°. The 
AGASA site has A = +35.78° and the zenith angle cut of the data is 9 m = 45°. 



B. The Simple AGN Model for UHECR Sources 

Now we consider a hypothesis that all or a certain fraction of UHECR with energy above 
a certain energy cut E c are originated from the AGN within a certain distance cut d c . 

For the correlation analysis, we use AGN listed in the Veron-Cetty and Veron (VCV) 
catalog. The 12th edition of VCV catalog contains 108,080 quasars (M < —23), BL Lacertae 
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FIG. 2. Skymaps of the arrival directions of UHECR, represented by black circles (o), with energy 
E > 5.7 x 10 19 eV observed by PAO (upper panel) and AGASA (lower panel) in the equatorial 
coordinates plotted using the Hammer projection. The solid red lines are the boundaries of the 
sky covered by PAO and AGASA experiments. The blue crosses (x) represent the locations of 
AGN with distance d < 100 Mpc from the VCV catalog. The blue square (■) shows the location 
of Centaurus A. 



objects and AGN (M > —23) (20|. The catalog includes a variety of contents of each object, 
for example, the equatorial coordinates, magnitude, color index, redshift factor and so on. 
The catalog may not be a complete list of AGN we need. But we believe it covers enough 
number of AGN so that the statistical analysis will yield the useful information. 

Because we apply the energy cut to UHECR data which is higher than the GZK cutoff, 
most of probable sources of them are expected to lie within the GZK radius. With this fact 
in mind and for the comparison with Refs. (I, H|, we pick up the AGN with the distance 
less than or equal to 100 Mpc (corresponding to the redshift z < 0.024) for the correlation 
test. The original number of AGN within 100 Mpc in the VCV catalog is 694. This includes 
7 AGN with zero redshift, which are problematic to be included in our analysis. Thus, we 
eliminate these seven AGN from our AGN data set and the remaining 687 AGN will be used 
in our analysis. We show their locations in FIG. [2] 

The UHECR luminosity would vary AGN by AGN in general. We now have a unified 
picture of AGN as the supermassive black hole surrounded by accreting materials. It is 
also known that the efficient particle acceleration can take place in the vicinity of the black 



hole and in the jet up to energies above 10 eV through several different mechanisms [40 



s 



However, it is still difficult to infer or to model the expected UHECR fluxes of all AGN 
due to limited measurements done so far and theoretical uncertainties. Thus, without a 
priori information, we assume that all AGN have the equal UHECR luminosity. Then, the 
UHECR flux from each AGN is just inversely proportional to the square of the distance to 
it. 

We are considering AGN as the point sources of UHECR. However, the trajectories of 
UHECR are bent by intervening magnetic fields. Thus, the arrival directions of UHECR 
deviate from the exact AGN positions. (We also need to consider the uncertainty coming 
from the limitation of the angular resolutions of detector arrays.) To take this effect into 
account, we consider AGN as the smeared sources of UHECR. The smearing effect also varies 
AGN by AGN as well as UHECR by UHECR. But, for simplicity, we assume that all AGN 
are smeared sources having gaussian flux distribution about the locations with a common 
angular width 9 S , called the smearing angle. Here, the smearing angle, 9 S , is taken to be 
a free parameter. Semi-analytic analysis and numerical simulations of large scale structure 
formation indicated that the deflection of UHECR with E > 4.0 x 10 19 eV is at best 2° ~ 3° 



4ll . |42| . However, we take 6° as a fiducial value of 9 S to get the rather conservative constraint 



on the simple AGN model [10 



Now the UHECR flux distribution of each AGN is given by 

where L is the common UHECR luminosity, di is the distance to the AGN, and 9 is the an- 
gular distance from the AGN position. The value of L is related to the total flux of UHECR, 
which we do not address here. Its precise value is not important for our purpose because 
it is equal for all AGN and our test compares the normalized probability distributions of 
UHECR events over the sphere. 

The UHECR with energy above the energy cut E c = 5.7 x 10 19 eV still can come from the 
sources lying outside the distance cut d c = 100 Mpc. The fraction of UHECR with E > E c 
coming from the sources with d < d c can be estimated as a function of E c and d c by solving 
the cosmic ray propagation equation. We denote this fraction as Jq. For E c = 5.7 x 10 19 eV 
and d c = 100 Mpc, the estimated value is f ~ 0.7 jsj]. 

In the simple AGN model considered here, we assume that the fraction / of observed 
UHECR with E > E c are originated from the AGN within the distance d c and the remaining 
fraction 1 — / from isotropic background contributions. We call / the AGN fraction and 
treat it as a free parameter of the model, while its fiducial value is taken to be / = /q. In 
FIG. El we show the distributions of the UHECR arrival directions expected from the simple 
AGN model with / = 1 and / = 0.45 as observed by PAO. 



IV. THE CORRELATION OF AGN AND UHECR DATA 

To illustrate our statistical methods, we take the set of selected AGN as our reference 
set and analyze the correlation between AGN and the arrival directions of UHECR by using 
the CADD method. 

In FIG. [4] we compare the cumulative probability distribution (CPD) of CADD between 
the selected AGN (with d < 100 Mpc) and the arrival directions of UHECR from PAO 
data (with E > 5.7 x 10 19 eV) with those expected from the simple AGN model which 
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-90° 



FIG. 3. The distributions of simulated UHECR arrival directions (2700 events, represented by red 
dots (.)) obtained from the simple AGN model with / = 1 (upper panel) and / = 0.45 (lower 
panel) in PAO case. Mapping and other symbols are same as in FIG. [2j 



is obtained through Monte-Carlo simulations. Because the KS test is reparametrization 
invariant, we use the variable x = cos 8 instead of the angular distance 8 in the horizontal 
axis. The CPD for the PAO data is the black line. The green line represents the CPD for 
the isotropic distribution (corresponding to / = 0) of UHECR and the blue and red lines 
represent the CPDs for the simple AGN model with a smearing angle of 6° for both and the 
source fraction / = 1 and / = 0.45, respectively. For comparing two different cumulative 



distribution functions S^x) and Sn 2 {x), the KS statistic is [43 



D = max\S Nl (x) — S N2 (x)\ , (6) 

thus it can be directly read off from the CPDs. From the KS statistic D of CADD, the 
probability that the PAO data are yielded from the given model is given approximately by 

Probability = Q KS ( v 7 ^ + 0.12 + O.ll/v 7 ^ d) , (7) 

where QksW = 2 J^°^ 1 (— l) J ~ 1 e _2j2A2 and N e is the effective number of data points, N e = 
NiN2/(Ni + N2). When the number of observed UHECR events is N Q and the number of 
AGN is M, Ni = N M for CADD, Nj = N (N - l)/2 for AADD, and iVi = N Q for FEVD. 
For N 2 , we replace N Q with N s , the number of simulated UHECR events. We compare 
the observed distribution, say S^, with the expected distribution Sn 2 , which is obtained 
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cos 6 



FIG. 4. The cumulative probability distributions of CADD using the AGN reference set. The 
black line is for the observed PAO data, the green line for the isotropic distribution, and the blue 
and red lines for the simple AGN model with a smearing angle 9 S = 6° and a source fraction 
/ = 1 and / = 0.45, respectively. The vertical bars represent the KS statistics which are the 
maximum differences between two CPDs. The numbers in two right columns in the legend are 
the KS statistics and the probabilities that the observed distribution of PAO arrival directions is 
obtained from the simple AGN models. 

from the model through simulations. The expected distribution can be made accurate by 
increasing the number of simulated UHECR events N s . We set N s = 10 6 for CADD and 
FEVD, and N s = 10 4 for AADD for the reason of practical computation, that is, it gives 
the accuracy sufficient for our purpose in reasonable computation time, though a little bit 
of fluctuations in calculating probabilities still remain. Since N s is much larger than N , 
N e w N±. The values of probability are shown in the upper-left corner in the figure. We 
note that the CPD itself is also useful for comparing the correlations. The steep rise of 
CPD near cos 8 = 1 means the strong correlation at small angles. Comparing the CPDs, we 
see that the PAO data are much more correlated with the AGN at small angles than the 
simple isotropic distribution. The probability that the PAO data result from the isotropic 
distribution is about 10~ 16 . But the PAO data are not correlated with AGN as much as 
required by the simple AGN model with / = 1 and a small smearing angle. For example 
the probability that the PAO data are yielded wholly from the AGN (that is / = 1) with 
the smearing angle of 6° is 2.1 x 10~ 37 , which is extremely small. 

We can do the same kind of analysis now by using the FEVD or the AADD. Different 
methods will give different probabilities for the same hypothesis, showing the difference in 
their discriminating powers. Thus, we combine three methods together to make better and 
reliable conclusions. 
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Source Fraction Source Fraction 

FIG. 5. Source fraction (/) dependence of probabilities from three different analysis methods for 
the AGN and the PAO data. 



A. Dependence on the AGN fraction and the smearing angle 

Now we examine the simple AGN model in detail. Even though we introduced two 
parameters / and 9 S for quite different motivations, their effects on the distribution of 
UHECR arrival directions are not clearly separated because both reducing / and making 9 S 
larger will make the distribution rather isotropic. Let us first consider the source fraction 
dependence of the distribution. In the simple AGN model, while / is the fraction of UHECR 
coming from the selected AGN, the remaining 1 — / is the fraction of UHECR coming from 
the isotropic background. In FIG. El we showed the / dependence of probabilities, obtained 
by using CADD, FEVD, and AADD methods, that the PAO data set and the AGASA 
data set are obtained from the simple AGN model (with AGN within 100 Mpc) with the 
smearing angle 6°. The CADD method gives the most stringent constraint and it clearly 
excludes both limits of / = (complete isotropy) and / = 1 (all UHECR from selected 
AGN). However, a striking result is that for the PAO data set the probability is peaked at 
/ ~ 0.45 with a value 0.064. In the previous section, we mentioned that the energy cut of 
UHECR £ = 5.7x 10 19 eV and the distance cut of AGN d c = 100 Mpc roughly determines 
the fraction of contribution from the inside of the distance cut, / « 0.7. The peak is away 
from this value and we need more isotropic components to get the best probability. For the 
fixed value of / ~ 0.7, a good probability can be achieved by increasing the smearing angle, 
as discussed below. 

For the AGASA data set, we don't see any correlation with AGN. Rather, the FEVD 
and AADD methods indicate that the observed AGASA data are consistent with complete 
isotropy, even though the CADD method still gives maximum but tiny probability. 

The smearing angle 8 S in the simple AGN model is a free parameter, though setting it a 
large value may needs some justification, such as the existence of very strong intergalactic 
magnetic fields. We also examine the smearing-angle dependence of the correlation analysis 
and show the results in FIG. |B]and[7| Here again the CADD method plays a main role, while 
the FEVD method is sensitive at small smearing angles. As the smearing angle increases, 
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Angular Width Angular Width 



FIG. 6. Smearing-angle (0 S ) dependence of probabilities from three different analysis methods for 
the PAO data (left) and the AG AS A data (right) with the source fraction / = 1. In the AGASA 
result (right), the CADD line is not seen because it lies below 10 -40 . 



the distribution of the arrival directions tends to be isotropic. In the above, we saw that the 
addition of isotropic component to some extent improves the probability for the PAO data 
set. Thus, we expect that increasing the smearing angle will also bring an improvement. 
For / = 1 (all UHECR from selected AGN), this happens for a rather large value of the 
smearing angle and the probability is peaked at 9 S = 48° with a value 0.016 (See the left 
panel of FIG. [6]). For / = 0.7, which is the reference value for E c = 5.7 x 10 19 eV and 
d c = 100 Mpc, the peak value is 0.019 at 9 S = 36° (See the left panel of FIG. EJ. If we set 
/ = 0.45, the small smearing angle is preferred (See the right panel of FIG. [7|). 



B. Dependence on Distance-Bin 

In the above correlation analysis, the number of observed UHECR is much smaller than 
the number of AGN. Thus, only some subset of AGN can actually be responsible for observed 
UHECR. Therefore, it is legitimate to classify AGN in some way to narrow down the possible 
sources of UHECR among them. Unfortunately, we don't have yet plausible criteria to 
classify AGN into possible UHECR sources and not. This is the reason why we included 
all AGN within a distance cut assuming the equal UHECR luminosity for all AGN in the 
previous analysis. 

Now we may consider a sophisticated hypothesis that AGN with certain properties or 
within a certain region are actually responsible for the UHECR. As for the first and simple- 
minded step toward this direction, we consider distance binning, though we don't have any 
reasonable motivation yet except the exploratory analysis of the UHECR data. FIG.[H]shows 
the probabilities by three analysis methods within the simple AGN model with equally sized 
20 Mpc distance-binning, for which we set / = 1, 9 S — 6° and use only those AGN within 
each bin as UHECR sources. While we have tried several different bin sizes of 10, 20 Mpc 
etc., here we present the results for the 20 Mpc bin size because it gives most interesting 
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FIG. 7. Smearing-angle (0 S ) dependence of probabilities from three different analysis methods for 
the PAO data with the source fraction / = 0.7 (left) and / = 0.45 (right). 



results. For the AG AS A data, nothing interesting happens. But, quite interestingly, we find 
that with the PAO data the bin with 40 Mpc < d < 60 Mpc gives remarkable probabilities 
for all three methods, and this is manifestly seen with the bin size of 20 Mpc. We may 
further try a single bin of arbitrary size to find the best AGN set for the PAO data. We 
find that the probability is maximized with the value Pcadd = 0.27 when we select AGN 
within 41 — 63 Mpc bin. For the comparison with FIG. [21 FIG. M shows the same skymap 
as the upper panel in FIG. [2] but for the AGN in the distance bin 40 Mpc < d < 60 Mpc. 
We report this result just because it looks interesting, though it seems difficult to make an 
appropriate interpretation yet. 



V. DISCUSSION AND CONCLUSION 

The previous analysis of the correlation between AGN and the arrival directions of 
UHECR done by PAO group was based on the number of UHECR events correlated with 
AGN, where 'correlated' means that UHECR lie within a certain angular distance ip from 
AGN @, |j. They used the data sets, AGN with d < 71 Mpc and UHECR with E > 
5.7 x 10 19 eV, and the angular distance ip = 3.2° which minimize the probability of the 
null hypothesis that the distribution of UHECR arrival directions is isotropic. Under this 
condition, they found 20 out of 27 UHECR events are correlated with AGN. The chance 
probability that this number occurs from the isotropic distribution is 4.6 x 10~ 9 . Instead of 
the correlated UHECR events, our methods compares the reduced one- dimensional distri- 
butions, CADD, AADD, and FEVD of the observed UHECR events and the expected ones 
from the model, or the hypothesis. When applied to the same data sets, our methods yield 
the chance probabilities Pcadd ~ 2.1 x 10~ 9 , Paadd ~ 4.9 x 10~ 2 , and Pfevd ~ 2.5 x 10^ 3 
from the isotropic distribution. These show that CADD is much better than FEVD and 
AADD in revealing anisotropy of the PAO data. The CADD method gives the result con- 
sistent with that of the PAO group. The advantage of using CADD is that we can avoid the 
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FIG. 9. Skymap of AGN with 40 Mpc < d < 60 Mpc and the arrival directions of UHECR with 
£>5.7x 10 19 eV observed by PAO. 



arbitrariness in tuning the size of ip to minimize the chance probability. 

Even though the null hypothesis of isotropic distribution can be ruled out in these ways, 
it does not directly test the correlation between AGN and UHECR. This fact was already 
pointed out in Ref. {23|. For the direct test of correlation, we must set up an alternative 
hypothesis that AGN are the sources of UHECR. The null hypothesis we adopted is rather 
simple-minded in that all AGN listed in the catalog are treated as equal sources of UHECR. 
Our test reveals that the hypothesis that all UHECR with E > 5.7 x 10 19 eV observed 
by PAO come from AGN with d < 71 Mpc is also ruled out by the chance probability of 
-Pcadd = 1-3 x 10~ 24 . It says that the observed UHECR are more correlated with AGN 
than the simple isotropic distribution, but not sufficiently correlated with AGN as required 



by that hypothesis. Our result is consistent with that of Ref. [23 . 

In conclusion, we developed the statistical methods for comparing two sets of arrival 
directions of cosmic rays, in which the two-dimensional distribution of arrival directions is 
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reduced to the one-dimensional distributions such as CADD, AADD, and FEVD. Then the 
probability that the observed arrival direction distribution comes from the given hypothesis 
can be calculated by the standard one-dimensional KS statistic. We applied three methods 
together in combination to the analysis of correlation between UHECR and AGN to get 
improved and reliable conclusions. We used UHECR with energies above 5.7 x 10 19 eV 
observed by PAO and AGASA, and AGN within the distance 100 Mpc listed in the 12th 
edition of VCV catalog. As our analysis showed, CADD is more useful than FEVD and 
AADD in testing the anisotropy and the correlation with AGN. 

We set up the simple AGN model which assumes that all AGN within a chosen distance 
cut are the equal sources of UHECR having the same UHECR luminosity and smearing angle. 
For the PAO data, our methods exclude both limits that the observed UHECR are simply 
isotropically distributed and that they are completely originated from the selected AGN. 
However, we need to add the isotropic component to incorporate the contribution from the 
outside of the distance cut and the smearing effect due to the intergalactic magnetic fields. 
Increasing isotropic component either through the background contribution or through the 
large smearing effect improves the situation greatly and makes the AGN hypothesis for 
UHECR sources a viable one. But the required amount of the isotropic component is more 
than the estimated contribution from the outside of the distance cut. Thus, the rather large 
smearing angle is needed. We also point out that restricting AGN within the distance bin 
of 40—60 Mpc happens to yield a good probability without appreciable isotropic component 
and large smearing effect. 

For the AGASA data, we don't find any significant correlation with AGN. Thus, as for 
the AGN hypothesis for the UHECR sources, we have quite different and even conflicting 
conclusions from PAO and AGASA experiments which mainly covers the southern sky and 
the northern sky, respectively. Of course, we cannot completely exclude the possibility that 
the observed correlation with AGN for the PAO data is just a mere chance. Now, the 
northern sky is being covered by Telescope Array (TA) experiment. We hope the UHECR 
data from TA will resolve this problem. 

[Note Added] As this work is completed, PAO released the new data of UHECR [H| . Also 



new catalogs of AGN are available now 1451 1. The results for these new data sets will be 
presented in a separate paper |46| . 
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